Cross-modal prediction in audio-visual communication

Authors

  • Ram R. Rao, Georgia Institute of Technology, Atlanta, GA 30332
  • Tsuhan Chen, AT&T Bell Laboratories, Holmdel, NJ 07733

Abstract

In this paper, we present a novel means of predicting the shape of a person's mouth from the corresponding speech signal and explore applications of this prediction to video coding. The prediction is accomplished by modeling the probability distribution of the audio-visual features with a Gaussian mixture density. The optimal estimate of the visual features given the acoustic features can then be computed from this probability distribution. The ability to predict a person's mouth shape from the corresponding audio leads to a number of interesting joint audio-video coding strategies. In the cross-modal predictive coding system described in this paper, a model-based video coder compares measured visual parameters with predicted visual parameters and sends the difference between the two to the receiver. Since the decoder also receives the acoustic data, it can form the same prediction and then reconstruct the original parameters by adding the transmitted error signal.
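The prediction and coding steps described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the abstract: "optimal" is taken to mean the minimum mean-square-error estimate (the conditional expectation of the visual feature given the audio feature), both features are reduced to single scalars, the mixture parameters in COMPONENTS are made-up illustrative numbers, and the names predict_visual, encode, and decode are hypothetical. No quantization of the error signal is modeled.

```python
import math

# Hypothetical two-component joint Gaussian mixture over (audio a, visual v).
# All numbers are illustrative assumptions, not values from the paper.
# Each component: (weight, mean_a, mean_v, var_a, cov_av, var_v)
COMPONENTS = [
    (0.5, -1.0, 0.2, 1.0, 0.3, 0.5),
    (0.5, 2.0, 0.8, 1.0, -0.2, 0.5),
]


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def predict_visual(a, components=COMPONENTS):
    """Conditional mean E[v | a] under the joint mixture (assumed MMSE estimate)."""
    # Unnormalized posterior weight of each component given the audio feature.
    likelihoods = [w * gaussian_pdf(a, ma, va) for (w, ma, mv, va, cav, vv) in components]
    total = sum(likelihoods)
    estimate = 0.0
    for lik, (w, ma, mv, va, cav, vv) in zip(likelihoods, components):
        # Per-component linear regression of v on a.
        cond_mean = mv + (cav / va) * (a - ma)
        estimate += (lik / total) * cond_mean
    return estimate


def encode(measured_v, a):
    """Encoder side: send only the error between measurement and prediction."""
    return measured_v - predict_visual(a)


def decode(residual, a):
    """Decoder side: rebuild the visual parameter from audio plus the error."""
    return predict_visual(a) + residual


if __name__ == "__main__":
    audio_feature, measured_mouth = 1.3, 0.65
    error_signal = encode(measured_mouth, audio_feature)
    print(decode(error_signal, audio_feature))  # close to the measured value
```

Because the encoder and decoder evaluate the same predictor on the same audio feature, the decoder recovers the measured parameter up to floating-point rounding. In a practical coder the error signal would be quantized before transmission, which this sketch omits.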


Similar resources

Auditory cross-modal reorganization in cochlear implant users indicates audio-visual integration

There is clear evidence for cross-modal cortical reorganization in the auditory system of post-lingually deafened cochlear implant (CI) users. A recent report suggests that moderate sensorineural hearing loss is already sufficient to initiate corresponding cortical changes. To what extent these changes are deprivation-induced or related to sensory recovery is still debated. Moreover, the influ...


Op-brai150076 1..6

Developmental vision is deemed to be necessary for the maturation of multisensory cortical circuits. Thus far, this has only been investigated in animal studies, which have shown that congenital visual deprivation markedly reduces the capability of neurons to integrate cross-modal inputs. The present study investigated the effect of transient congenital visual deprivation on the neural mechanis...


Onsets Coincidence for Cross-Modal Analysis (Irwin and Joan Jacobs Center for Communication and Information Technologies)

Cross-modal analysis offers information beyond that extracted from individual modalities. Consider a non-trivial scene that includes several moving visual objects, some of which emit sounds. The scene is sensed by a camcorder with a single microphone. A task for audio-visual analysis is to assess the number of independent audio-associated visual objects (AVOs), pinpoint the AVOs’ spatial loca...


Detection of auditory (cross-spectral) and auditory-visual (cross-modal) synchrony

Detection thresholds for temporal synchrony in auditory and auditory-visual sentence materials were obtained on normal-hearing subjects. For auditory conditions, thresholds were determined using an adaptive-tracking procedure to control the degree of temporal asynchrony of a narrow audio band of speech, both positive and negative in separate tracks, relative to three other narrow audio bands of...


Audiovisual quality integration for interactive communications

This paper investigates multi-modal aspects of audiovisual quality assessment for interactive communication services. It shows how perceived auditory and visual qualities integrate to an overall audiovisual quality perception in different experimental contexts. Two audiovisual experiments are presented and provide experimental data for the present analysis. First, two experimental contexts are ...



Publication date: 1996